Particle Dynamics for Latent-Variable Energy-Based Models
Tang, Shiqin, Zhuang, Shuxin, Feng, Rong, Yu, Runsheng, Li, Hongzong, Zhang, Youzhi
Latent-variable energy-based models (LV-EBMs) assign a single normalized energy to joint pairs of observed data and latent variables, offering expressive generative modeling while capturing hidden structure. We recast maximum-likelihood training as a saddle-point problem over distributions on the latent and joint manifolds and view the inner updates as coupled Wasserstein gradient flows. The resulting algorithm alternates overdamped Langevin updates for a joint negative pool and for conditional latent particles with stochastic parameter ascent, requiring no discriminator or auxiliary networks. We prove existence and convergence under standard smoothness and dissipativity assumptions, with decay rates in KL divergence and Wasserstein-2 distance. The saddle-point view further yields an ELBO strictly tighter than bounds obtained with restricted amortized posteriors. Our method is evaluated on numerical approximations of physical systems and performs competitively against comparable approaches.
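A minimal sketch of the overdamped Langevin update used for the negative pool, with a toy quadratic energy standing in for the learned joint energy (all names and step sizes here are illustrative, not the paper's code):

```python
import numpy as np

def grad_energy(x):
    # Toy energy E(x) = ||x||^2 / 2, so grad E(x) = x and exp(-E) is N(0, I).
    return x

def langevin_step(particles, step, rng):
    # Overdamped Langevin: x <- x - step * grad E(x) + sqrt(2 * step) * noise.
    noise = rng.standard_normal(particles.shape)
    return particles - step * grad_energy(particles) + np.sqrt(2 * step) * noise

rng = np.random.default_rng(0)
pool = 3.0 * rng.standard_normal((512, 2))   # negative pool, started off-target
for _ in range(2000):
    pool = langevin_step(pool, step=1e-2, rng=rng)
# The pool now approximately samples exp(-E), here a standard Gaussian.
```

In the paper's scheme two such particle systems are evolved in parallel (a joint pool and conditional latent particles), interleaved with stochastic ascent on the energy's parameters.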
The Artificial Scientist -- in-transit Machine Learning of Plasma Simulations
Kelling, Jeffrey, Bolea, Vicente, Bussmann, Michael, Checkervarty, Ankush, Debus, Alexander, Ebert, Jan, Eisenhauer, Greg, Gutta, Vineeth, Kesselheim, Stefan, Klasky, Scott, Pausch, Richard, Podhorszki, Norbert, Poschel, Franz, Rogers, David, Rustamov, Jeyhun, Schmerler, Steve, Schramm, Ulrich, Steiniger, Klaus, Widera, Rene, Willmann, Anna, Chandrasekaran, Sunita
Increasing HPC cluster sizes and large-scale simulations that produce petabytes of data per run create massive I/O and storage challenges for analysis. Deep-learning-based techniques, in particular, make use of these amounts of domain data to extract patterns that help build scientific understanding. Here, we demonstrate a streaming workflow in which simulation data is streamed directly to a machine-learning (ML) framework, circumventing the file-system bottleneck. Data is transformed in transit, asynchronously to the simulation and to the training of the model. With the presented workflow, data operations can be performed in common and easy-to-use programming languages, freeing the application user from adapting the application's output routines. As a proof of concept, we consider a GPU-accelerated particle-in-cell (PIConGPU) simulation of the Kelvin-Helmholtz instability (KHI). We employ experience replay to avoid catastrophic forgetting while learning from this non-steady process in a continual manner. We detail the challenges addressed while porting and scaling to the Frontier exascale system.
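Experience replay for a non-steady stream can be sketched with a reservoir-style buffer that keeps early simulation snapshots represented as new ones arrive; the class below is a hypothetical minimal version, not the actual implementation in the PIConGPU workflow:

```python
import random

class ReplayBuffer:
    """Fixed-size buffer with reservoir-style insertion: snapshots from early in
    the streaming simulation stay represented, mitigating catastrophic forgetting."""
    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.data = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, item):
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(item)
        else:
            # Each of the `seen` items survives with equal probability capacity/seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.data[j] = item

    def sample(self, batch_size):
        # Training batches mix old and new snapshots of the non-steady process.
        return self.rng.sample(self.data, min(batch_size, len(self.data)))

buf = ReplayBuffer(capacity=100)
for t in range(1000):            # stand-in for in-transit simulation snapshots
    buf.add(t)
batch = buf.sample(16)
```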
DEL: Discrete Element Learner for Learning 3D Particle Dynamics with Neural Rendering
Wang, Jiaxu, Sun, Jingkai, He, Junhao, Zhang, Ziyi, Zhang, Qiang, Sun, Mingyuan, Xu, Renjing
Learning-based simulators show great potential for simulating particle dynamics when 3D ground truth is available, but per-particle correspondences are not always accessible. The development of neural rendering offers a new route: learning 3D dynamics from 2D images by inverse rendering. However, existing approaches still suffer from the ill-posed nature of 2D-to-3D uncertainty; for example, a given 2D image can correspond to many different 3D particle distributions. To mitigate this uncertainty, we adopt a conventional, mechanically interpretable framework as a physical prior and extend it to a learning-based version. In brief, we incorporate learnable graph kernels into the classic Discrete Element Analysis (DEA) framework to implement a novel mechanics-integrated learning system. The graph-network kernels are used only to approximate specific mechanical operators within the DEA framework rather than the whole dynamics mapping. By integrating these strong physics priors, our method can effectively learn the dynamics of various materials from partial 2D observations in a unified manner. Experiments show that our approach outperforms other learned simulators by a large margin in this context and is robust to different renderers, fewer training samples, and fewer camera views.
Coupling parameter and particle dynamics for adaptive sampling in Neural Galerkin schemes
Wen, Yuxiao, Vanden-Eijnden, Eric, Peherstorfer, Benjamin
Training nonlinear parametrizations such as deep neural networks to numerically approximate solutions of partial differential equations is often based on minimizing a loss that includes the residual, which is analytically available in limited settings only. At the same time, empirically estimating the training loss is challenging because residuals and related quantities can have high variance, especially for transport-dominated and high-dimensional problems that exhibit local features such as waves and coherent structures. Thus, estimators based on data samples from uninformed, uniform distributions are inefficient. This work introduces Neural Galerkin schemes that estimate the training loss with data from adaptive distributions, which are empirically represented via ensembles of particles. The ensembles are actively adapted by evolving the particles with dynamics coupled to the nonlinear parametrizations of the solution fields, so that the ensembles remain informative for estimating the training loss. Numerical experiments indicate that few dynamic particles are sufficient for obtaining accurate empirical estimates of the training loss, even for problems with local features and with high-dimensional spatial domains.
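The advantage of adaptive over uniform sampling can be seen in a one-dimensional caricature: a sharply localized squared residual is integrated far more accurately by importance-weighted particles placed near the feature than by uniform samples. The residual, proposal, and sample sizes below are assumptions for the sketch, not the paper's actual scheme:

```python
import numpy as np

def residual_sq(x, center=0.5, width=0.01):
    # Squared residual with one sharp local feature, as in transport-dominated problems.
    return np.exp(-((x - center) / width) ** 2)

true_loss = 0.01 * np.sqrt(np.pi)   # exact integral of residual_sq

rng = np.random.default_rng(0)
n = 500

# Uninformed estimator: uniform samples on [0, 1] mostly miss the feature.
xu = rng.uniform(0.0, 1.0, n)
est_uniform = residual_sq(xu).mean()

# Adaptive estimator: particles concentrated near the feature, de-biased by
# importance weights 1/q(x), where q is the particle density.
sigma = 0.02
xa = 0.5 + sigma * rng.standard_normal(n)
q = np.exp(-0.5 * ((xa - 0.5) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
est_adaptive = (residual_sq(xa) / q).mean()
```

Both estimators are unbiased, but the adaptive one has far lower variance at the same sample budget, which is the effect the particle ensembles exploit.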
Mean-field analysis for heavy ball methods: Dropout-stability, connectivity, and global convergence
Wu, Diyuan, Kungurtsev, Vyacheslav, Mondelli, Marco
The stochastic heavy ball method (SHB), also known as stochastic gradient descent (SGD) with Polyak's momentum, is widely used in training neural networks. However, despite the remarkable practical success of this algorithm, its theoretical characterization remains limited. In this paper, we focus on neural networks with two and three layers and provide a rigorous understanding of the properties of the solutions found by SHB: \emph{(i)} stability after dropping out part of the neurons, \emph{(ii)} connectivity along a low-loss path, and \emph{(iii)} convergence to the global optimum. To achieve this goal, we take a mean-field view and relate the SHB dynamics to a certain partial differential equation in the limit of large network widths. This mean-field perspective has inspired a recent line of work focusing on SGD; in contrast, our paper considers an algorithm with momentum. More specifically, after proving existence and uniqueness of the limit differential equations, we show convergence to the global optimum and give a quantitative bound between the mean-field limit and the SHB dynamics of a finite-width network. Armed with this last bound, we are able to establish the dropout-stability and connectivity of SHB solutions.
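For concreteness, the SHB update analyzed here is plain SGD with Polyak momentum; a minimal sketch on a toy quadratic (the noise model, step size, and momentum value are assumptions for illustration):

```python
import numpy as np

def shb_minimize(grad, x0, lr=0.05, beta=0.9, steps=500, rng=None):
    # Stochastic heavy ball: v <- beta * v - lr * g;  x <- x + v,
    # where g is a noisy estimate of the gradient (Polyak's momentum).
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    v = np.zeros_like(x)
    for _ in range(steps):
        g = grad(x) + 0.01 * rng.standard_normal(x.shape)  # stochastic oracle
        v = beta * v - lr * g
        x = x + v
    return x

# Minimize f(x) = ||x - 1||^2 / 2, whose gradient is x - 1.
x_star = shb_minimize(lambda x: x - 1.0, x0=np.zeros(3),
                      rng=np.random.default_rng(0))
```

In the mean-field analysis each neuron's parameters follow such an update, and the empirical distribution over neurons is tracked as the width grows.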
Physics-constrained 3D Convolutional Neural Networks for Electrodynamics
Scheinker, Alexander, Pokharel, Reeju
We present a physics-constrained neural network (PCNN) approach to solving Maxwell's equations for the electromagnetic fields of intense relativistic charged particle beams. We create a 3D convolutional PCNN to map time-varying current and charge densities J(r,t) and ρ(r,t) to vector and scalar potentials A(r,t) and V(r,t), from which we generate electromagnetic fields according to Maxwell's equations: B = curl(A), E = -grad(V) - dA/dt. Our PCNNs satisfy hard constraints, such as div(B) = 0, by construction. Soft constraints push A and V towards satisfying the Lorenz gauge.
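The hard-constraint mechanism can be checked numerically: when B is produced as a discrete curl of any A, its discrete divergence vanishes identically, because difference operators acting along different axes commute. A minimal finite-difference sketch (with `np.gradient` standing in for the network's output head; this is illustrative, not the paper's architecture):

```python
import numpy as np

def curl(Ax, Ay, Az, dx=1.0):
    # B = curl(A) via central differences (one-sided at the boundaries).
    dAz_dy, dAy_dz = np.gradient(Az, dx, axis=1), np.gradient(Ay, dx, axis=2)
    dAx_dz, dAz_dx = np.gradient(Ax, dx, axis=2), np.gradient(Az, dx, axis=0)
    dAy_dx, dAx_dy = np.gradient(Ay, dx, axis=0), np.gradient(Ax, dx, axis=1)
    return dAz_dy - dAy_dz, dAx_dz - dAz_dx, dAy_dx - dAx_dy

def divergence(Bx, By, Bz, dx=1.0):
    return (np.gradient(Bx, dx, axis=0) + np.gradient(By, dx, axis=1)
            + np.gradient(Bz, dx, axis=2))

rng = np.random.default_rng(0)
A = [rng.standard_normal((16, 16, 16)) for _ in range(3)]  # arbitrary potential
B = curl(*A)
div_B = divergence(*B)
# div(curl A) vanishes to machine precision regardless of A, because the
# x, y, and z difference operators commute with one another.
```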
Particle Dynamics for Learning EBMs
Neklyudov, Kirill, Jaini, Priyank, Welling, Max
Energy-based modeling is a promising approach to unsupervised learning, which yields many downstream applications from a single model. The main difficulty in learning energy-based models with "contrastive approaches" is the generation of samples from the current energy function at each iteration. Many advances have been made to accomplish this subroutine cheaply. Nevertheless, all such sampling paradigms run MCMC targeting the current model, which requires infinitely long chains to generate samples from the true energy distribution and is problematic in practice. This paper proposes an alternative approach to obtaining these samples that avoids crude MCMC sampling from the current model. We accomplish this by viewing the evolution of the modeling distribution as (i) the evolution of the energy function, and (ii) the evolution of the samples from this distribution along some vector field. We then derive this time-dependent vector field such that particles following it are approximately distributed according to the current density model, thereby matching the evolution of the particles with the evolution of the energy function prescribed by the learning procedure. Importantly, unlike Monte Carlo sampling, our method aims to match the current distribution in finite time. Finally, we demonstrate its effectiveness empirically against MCMC-based learning methods.
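The idea of transporting particles along a vector field matched to the energy's evolution has a closed-form toy case: if the energy is a translating quadratic, the matching vector field is just the mode's velocity, and particles initialized from the model stay distributed as the current model without any MCMC. This example is illustrative, not taken from the paper:

```python
import numpy as np

def mu(t):
    # Mode of the time-dependent energy E_t(x) = ||x - mu(t)||^2 / 2.
    return np.array([np.sin(t), np.cos(t)])

def mu_dot(t):
    # Time derivative of the mode: the matching vector field for a translation.
    return np.array([np.cos(t), -np.sin(t)])

rng = np.random.default_rng(0)
particles = mu(0.0) + rng.standard_normal((5000, 2))  # exact samples at t = 0

dt, T = 1e-3, 2.0
for k in range(int(T / dt)):
    # No MCMC: each particle simply follows the vector field v_t(x) = mu'(t).
    particles = particles + dt * mu_dot(k * dt)
# Particles remain (approximately) distributed as the current model N(mu(T), I).
```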
Traditional and accelerated gradient descent for neural architecture search
Trillos, Nicolas Garcia, Morales, Felix, Morales, Javier
In this paper, we introduce two algorithms for neural architecture search (NASGD and NASAGD), following the theoretical work by two of the authors [4], which introduced the conceptual basis for new notions of traditional and accelerated gradient descent algorithms for optimizing a function on a semi-discrete space using ideas from optimal transport theory. Our methods, which use the network morphism framework introduced in [3] as a baseline, can analyze forty times as many architectures as the hill-climbing methods [3, 11] while using the same computational resources and time, and achieve comparable levels of accuracy.
Mean-field theory of two-layers neural networks: dimension-free bounds and kernel limit
Mei, Song, Misiakiewicz, Theodor, Montanari, Andrea
We consider learning two-layer neural networks using stochastic gradient descent. The mean-field description of this learning dynamics approximates the evolution of the network weights by an evolution in the space of probability distributions in $R^D$ (where $D$ is the number of parameters associated to each neuron). This evolution can be defined through a partial differential equation or, equivalently, as the gradient flow in the Wasserstein space of probability distributions. Earlier work shows that (under some regularity assumptions) the mean-field description is accurate as soon as the number of hidden units is much larger than the dimension $D$. In this paper we establish stronger and more general approximation guarantees. First of all, we show that the number of hidden units only needs to be larger than a quantity dependent on the regularity properties of the data, independent of the dimension. Next, we generalize this analysis to the case of unbounded activation functions, which was not covered by earlier bounds. We extend our results to noisy stochastic gradient descent. Finally, we show that kernel ridge regression can be recovered as a special limit of the mean-field analysis.
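The mean-field picture treats each hidden unit's parameters as one particle whose empirical distribution evolves under SGD. A toy sketch of that setup (the teacher function, widths, and step sizes are assumptions for illustration, not the paper's experiments):

```python
import numpy as np

rng = np.random.default_rng(0)
D, N, lr = 2, 500, 0.1   # input dim, hidden units ("particles"), step size

# Two-layer net f(x) = (1/N) sum_i a_i relu(w_i . x): each neuron (a_i, w_i)
# is a particle in R^(D+1); SGD moves the empirical distribution of particles.
a = rng.standard_normal(N)
W = rng.standard_normal((N, D))
teacher = np.array([1.0, -1.0])          # toy target: relu(teacher . x)

def forward(X):
    return np.maximum(X @ W.T, 0.0) @ a / N

X_eval = rng.standard_normal((2000, D))
y_eval = np.maximum(X_eval @ teacher, 0.0)
mse_init = np.mean((forward(X_eval) - y_eval) ** 2)

for _ in range(4000):
    X = rng.standard_normal((64, D))
    y = np.maximum(X @ teacher, 0.0)
    pre = X @ W.T                              # (batch, N) pre-activations
    err = (np.maximum(pre, 0.0) @ a / N) - y   # (batch,) residuals
    # Per-particle gradients of the squared loss, with the 1/N output scaling
    # absorbed into the step size (the usual mean-field parametrization).
    a -= lr * (err @ np.maximum(pre, 0.0)) / len(X)
    W -= lr * ((err[:, None] * a * (pre > 0)).T @ X) / len(X)

mse_final = np.mean((forward(X_eval) - y_eval) ** 2)
```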